OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Using Bags of Symbols for Automatic Indexing of Graphical Document Image Databases

Internal identifier: 000F90 (Main/Exploration); previous: 000F89; next: 000F91

Authors: Eugen Barbu [France]; Pierre Héroux [France]; Sébastien Adam [France]; Éric Trupin [France]

Source: Lecture Notes in Computer Science (ISSN 0302-9743), 2006

RBID : ISTEX:03C7856EF4395B19A1389C74DF36265ED92B3436

Abstract

A database is only useful if it is associated with a set of procedures that retrieve the elements relevant to users’ needs. Many IR techniques have been developed for automatic indexing and retrieval in document databases. Most of these use indexes that depend on the textual content of documents, and very few can handle graphical or image content without human annotation. This paper describes an approach, similar to the bag-of-words technique, for automatically indexing graphical document image databases, together with different ways of querying these databases. In an unsupervised manner, the approach proposes a set of automatically discovered symbols that can be combined with logical operators to build queries.
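
As a rough illustration of the querying idea in the abstract, the sketch below indexes each document image by the symbols found in it and answers queries that combine symbols with AND, OR and NOT. The symbol names, document names, helper functions and inverted-index layout are all hypothetical, chosen only for illustration; this page does not describe the paper's own symbol discovery step or index structure, and occurrence counts are ignored here because only presence matters for logical queries.

```python
from collections import defaultdict


def build_index(bags):
    """Map each symbol to the set of documents whose bag contains it."""
    index = defaultdict(set)
    for doc_id, symbols in bags.items():
        for symbol in symbols:
            index[symbol].add(doc_id)
    return index


def query_and(index, *symbols):
    """Documents containing every one of the given symbols."""
    sets = [index.get(s, set()) for s in symbols]
    return set.intersection(*sets) if sets else set()


def query_or(index, *symbols):
    """Documents containing at least one of the given symbols."""
    return set().union(*(index.get(s, set()) for s in symbols))


def query_not(all_docs, docs):
    """Documents of the collection that are not in `docs`."""
    return set(all_docs) - set(docs)


# Hypothetical bags of symbols, one per document image (assumed data).
bags = {
    "plan_01": ["door", "window", "window", "stairs"],
    "plan_02": ["door", "sink"],
    "plan_03": ["window", "sink"],
}
index = build_index(bags)

# door AND window
door_and_window = query_and(index, "door", "window")

# door AND NOT sink
door_not_sink = query_or(index, "door") & query_not(bags, query_or(index, "sink"))
```

Representing each query result as a plain set keeps the logical operators trivial to compose: AND is intersection, OR is union, and NOT is a difference against the full collection.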

DOI: 10.1007/11767978_18



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Using Bags of Symbols for Automatic Indexing of Graphical Document Image Databases</title>
<author>
<name sortKey="Barbu, Eugen" sort="Barbu, Eugen" uniqKey="Barbu E" first="Eugen" last="Barbu">Eugen Barbu</name>
</author>
<author>
<name sortKey="Heroux, Pierre" sort="Heroux, Pierre" uniqKey="Heroux P" first="Pierre" last="Héroux">Pierre Héroux</name>
</author>
<author>
<name sortKey="Adam, Sebastien" sort="Adam, Sebastien" uniqKey="Adam S" first="Sébastien" last="Adam">Sébastien Adam</name>
</author>
<author>
<name sortKey="Trupin, Eric" sort="Trupin, Eric" uniqKey="Trupin E" first="Éric" last="Trupin">Éric Trupin</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:03C7856EF4395B19A1389C74DF36265ED92B3436</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11767978_18</idno>
<idno type="url">https://api.istex.fr/document/03C7856EF4395B19A1389C74DF36265ED92B3436/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">003167</idno>
<idno type="wicri:Area/Istex/Curation">002F26</idno>
<idno type="wicri:Area/Istex/Checkpoint">000963</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Barbu E:using:bags:of</idno>
<idno type="wicri:Area/Main/Merge">001007</idno>
<idno type="wicri:Area/Main/Curation">000F90</idno>
<idno type="wicri:Area/Main/Exploration">000F90</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Using Bags of Symbols for Automatic Indexing of Graphical Document Image Databases</title>
<author>
<name sortKey="Barbu, Eugen" sort="Barbu, Eugen" uniqKey="Barbu E" first="Eugen" last="Barbu">Eugen Barbu</name>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LITIS, Université de Rouen, F-76800, Saint-Etienne du Rouvray</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne du Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">France</country>
</affiliation>
</author>
<author>
<name sortKey="Heroux, Pierre" sort="Heroux, Pierre" uniqKey="Heroux P" first="Pierre" last="Héroux">Pierre Héroux</name>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LITIS, Université de Rouen, F-76800, Saint-Etienne du Rouvray</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne du Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Adam, Sebastien" sort="Adam, Sebastien" uniqKey="Adam S" first="Sébastien" last="Adam">Sébastien Adam</name>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LITIS, Université de Rouen, F-76800, Saint-Etienne du Rouvray</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne du Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author>
<name sortKey="Trupin, Eric" sort="Trupin, Eric" uniqKey="Trupin E" first="Éric" last="Trupin">Éric Trupin</name>
<affiliation wicri:level="4">
<country xml:lang="fr">France</country>
<wicri:regionArea>LITIS, Université de Rouen, F-76800, Saint-Etienne du Rouvray</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne du Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">03C7856EF4395B19A1389C74DF36265ED92B3436</idno>
<idno type="DOI">10.1007/11767978_18</idno>
<idno type="ChapterID">18</idno>
<idno type="ChapterID">Chap18</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: A database is only useful if it is associated with a set of procedures that retrieve the elements relevant to users’ needs. Many IR techniques have been developed for automatic indexing and retrieval in document databases. Most of these use indexes that depend on the textual content of documents, and very few can handle graphical or image content without human annotation. This paper describes an approach, similar to the bag-of-words technique, for automatically indexing graphical document image databases, together with different ways of querying these databases. In an unsupervised manner, the approach proposes a set of automatically discovered symbols that can be combined with logical operators to build queries.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Haute-Normandie</li>
<li>Région Normandie</li>
</region>
<settlement>
<li>Saint-Etienne du Rouvray</li>
</settlement>
<orgName>
<li>Université de Rouen</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Région Normandie">
<name sortKey="Barbu, Eugen" sort="Barbu, Eugen" uniqKey="Barbu E" first="Eugen" last="Barbu">Eugen Barbu</name>
</region>
<name sortKey="Adam, Sebastien" sort="Adam, Sebastien" uniqKey="Adam S" first="Sébastien" last="Adam">Sébastien Adam</name>
<name sortKey="Barbu, Eugen" sort="Barbu, Eugen" uniqKey="Barbu E" first="Eugen" last="Barbu">Eugen Barbu</name>
<name sortKey="Heroux, Pierre" sort="Heroux, Pierre" uniqKey="Heroux P" first="Pierre" last="Héroux">Pierre Héroux</name>
<name sortKey="Trupin, Eric" sort="Trupin, Eric" uniqKey="Trupin E" first="Éric" last="Trupin">Éric Trupin</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F90 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F90 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:03C7856EF4395B19A1389C74DF36265ED92B3436
   |texte=   Using Bags of Symbols for Automatic Indexing of Graphical Document Image Databases
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024